I. Introduction

Table 1. Sample of 5 randomly chosen countries from the data set used in this study

| Country | cumulative_confirmed_cases_per_10000 | Stringency_Index | Economic_Support_Index | Economic_Support_Index_levels |
|---|---|---|---|---|
| Fiji | 0.3708061 | 78.70 | 75 | [75,87.5) |
| Mexico | 69.0394942 | 70.83 | 0 | [0,12.5) |
| Pakistan | 15.1022797 | 61.11 | 75 | [75,87.5) |
| Zambia | 9.0112384 | 39.81 | 25 | [25,37.5) |
| United Arab Emirates | 125.1447081 | 72.22 | 50 | [50,62.5) |

| Country | Population2019 | age15_64_population_prop_2019 | nurses_midwives_per_1000_2018 | Smoking_prevalence_15_2016 |
|---|---|---|---|---|
| Fiji | 889953 | 65.08209 | 3.3752 | 22.6 |
| Mexico | 127575529 | 66.39822 | 2.3961 | 14.0 |
| Pakistan | 216565318 | 60.62321 | 0.6683 | 20.1 |
| Zambia | 17861030 | 53.42218 | 1.3376 | 13.8 |
| United Arab Emirates | 9770529 | 84.13084 | 5.7271 | 28.9 |

II. Exploratory data analysis


Table 2. Summary for the cumulative confirmed cases per 10,000

| n | min | median | mean | max | sd |
|---|---|---|---|---|---|
| 78 | 0.0334753 | 27.50465 | 67.20095 | 461.5392 | 92.11358 |

Figure 1. Distribution for the cumulative confirmed cases per 10,000 for individual countries

Figure 2. Distribution for the government response measured by the Stringency Index

Figure 3. Distribution for the government response measured by the Economic Support Index

Figure 4. Distribution for the Proportion of population that is 15-64 years old, in 2019 for individual countries

Figure 5. Distribution for nurses and midwives per 1000 in 2018 for individual countries

Figure 6. Distribution for the Smoking Prevalence for 15+ year olds, in 2016 for individual countries

Figure 7.1. Interactive Scatterplot for the cumulative confirmed cases per 10,000 for individual countries against their government response measured by the Stringency Index. The red line is the best fit line. The blue curve is the Loess curve.

Figure 7.2. Interactive Scatterplot for the cumulative confirmed cases per 10,000 for individual countries against their government response measured by the Stringency Index, grouped by the Economic Support Index levels

Figure 7.3. Interactive Scatterplot for the cumulative confirmed cases per 10,000 for individual countries against their government response measured by the Stringency Index, grouped and divided by the Economic Support Index levels

Figure 8. Interactive Scatterplot for the cumulative confirmed cases per 10,000 for individual countries against their government response measured by the Economic Support Index. The red line is the best fit line. The blue curve is the Loess curve.

Figure 9. Interactive Scatterplot for the cumulative confirmed cases per 10,000 for individual countries against their Proportion of population that is 15-64 years old, in 2019. The red line is the best fit line. The blue curve is the Loess curve.

Figure 10. Interactive Scatterplot for the cumulative confirmed cases per 10,000 for individual countries against their Smoking prevalence for 15+ year olds in 2016. The red line is the best fit line. The blue curve is the Loess curve.

Figure 11. Interactive Scatterplot for the cumulative confirmed cases per 10,000 for individual countries against their Service coverage index in 2017. The red line is the best fit line. The blue curve is the Loess curve.

Figure 12. Boxplot of relationship between the cumulative confirmed cases per 10,000 for individual countries and the Economic Support Index levels


III. Multiple linear regression

i. Methods


We use the transformed response, \((\text{cumulative confirmed cases per 10,000})^{0.2}\), because the previous report showed skewness and non-normal error terms for the untransformed response.

Figure 13. Distribution for the cumulative confirmed cases per 10,000 raised to 0.2, for individual countries
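
As a worked illustration of the transformation (the symbols \(Y\) and \(Y^{*}\) are notation assumed here, not taken from the report), the response is raised to the power 0.2, and a fitted value is returned to the original scale by raising it to the fifth power. Using Fiji's value from Table 1, rounded:

\[ \begin{aligned} Y^{*} &= Y^{0.2}, \qquad Y = (Y^{*})^{5} \\ 0.3708061^{0.2} &\approx 0.82 \end{aligned} \]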

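Table 3. Correlation matrix for the numeric variables in the study. CCCPTTH = cumulative confirmed cases per 10,000; SI = Stringency Index; ESI = Economic Support Index; NM = nurses and midwives per 1000; SP = smoking prevalence for ages 15+.

| Variable | CCCPTTH | CCCPTTH^0.2 | 2019 Population | SI | ESI | 15 to 64 y/o 2019 population proportion | NM 2018 | SP 2016 |
|---|---|---|---|---|---|---|---|---|
| cumulative_confirmed_cases_per_10000 | 1.000 | 0.857 | -0.039 | 0.335 | 0.121 | 0.574 | 0.350 | -0.023 |
| cumulative_confirmed_cases_per_10000_transf | 0.857 | 1.000 | 0.039 | 0.404 | 0.165 | 0.541 | 0.393 | -0.031 |
| Population2019 | -0.039 | 0.039 | 1.000 | 0.081 | 0.035 | 0.033 | -0.082 | -0.075 |
| Stringency_Index | 0.335 | 0.404 | 0.081 | 1.000 | -0.136 | 0.204 | -0.315 | -0.261 |
| Economic_Support_Index | 0.121 | 0.165 | 0.035 | -0.136 | 1.000 | 0.061 | 0.293 | 0.061 |
| age15_64_population_prop_2019 | 0.574 | 0.541 | 0.033 | 0.204 | 0.061 | 1.000 | 0.324 | 0.243 |
| nurses_midwives_per_1000_2018 | 0.350 | 0.393 | -0.082 | -0.315 | 0.293 | 0.324 | 1.000 | 0.325 |
| Smoking_prevalence_15_2016 | -0.023 | -0.031 | -0.075 | -0.261 | 0.061 | 0.243 | 0.325 | 1.000 |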

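We start from the following linear model and apply natural splines to the Stringency Index, the proportion of the population aged 15 to 64, and nurses and midwives per 1000: \[ \begin{aligned}\widehat{Y}_{CCPTTH}^{0.2} =& b_{0} + b_{SI} \cdot (x_1) + b_{ESI} \cdot (x_2) + b_{15to64 APP} \cdot (x_{3}) \\ & + b_{NM} \cdot (x_{4}) + b_{SP} \cdot (x_{12}) \end{aligned} \]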

Figure 14. Normal Q-Q plot for the model under discussion

Figure 15. Residuals distribution for the statistical model

Figure 16. Residuals graph for the fitted values, with a Lowess curve in blue and a horizontal line at zero in red.

Figure 17. Residuals graph for the Stringency Index, with a Lowess curve in blue and a horizontal line at zero in red.

Figure 18. Residuals graph for the Economic Support Index, with a Lowess curve in blue and a horizontal line at zero in red.

Figure 19. Residuals graph for the Proportion of population that is 15-64 years old, in 2019, with a Lowess curve in blue and a horizontal line at zero in red.

Figure 20. Residuals graph for the Smoking prevalence for 15+ year olds in 2016, with a Lowess curve in blue and a horizontal line at zero in red.

Figure 21. Residuals graph for the nurses and midwives per 1000 in 2018, with a Lowess curve in blue and a horizontal line at zero in red.

Table 4. Generalized variance inflation factors (GVIF) for the model terms

##                                                        GVIF Df GVIF^(1/(2*Df))
## ns(Stringency_Index, knots = c(25, 50, 75))        1.689995  4        1.067790
## Economic_Support_Index                             1.181773  1        1.087094
## ns(age15_64_population_prop_2019, knots = c(67.5)) 1.673899  2        1.137450
## ns(nurses_midwives_per_1000_2018, knots = c(10))   2.150052  2        1.210911
## Smoking_prevalence_15_2016                         1.324239  1        1.150756
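
The last column of Table 4 rescales each generalized VIF by its degrees of freedom, so that terms with different numbers of coefficients can be compared. As a check on the arithmetic, for the Stringency Index spline (Df = 4) and the Economic Support Index (Df = 1):

\[ \begin{aligned} GVIF_{SI}^{1/(2 \cdot Df)} &= 1.689995^{1/8} \approx 1.0678 \\ GVIF_{ESI}^{1/(2 \cdot Df)} &= 1.181773^{1/2} \approx 1.0871 \end{aligned} \]

All adjusted values in the table are close to 1, which suggests little multicollinearity among the terms.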

ii. Model Results


Table 5. Model Summary Table

## 
## Call:
## lm(formula = cumulative_confirmed_cases_per_10000_transf ~ ns(Stringency_Index, 
##     knots = c(25, 50, 75)) + Economic_Support_Index + ns(age15_64_population_prop_2019, 
##     knots = c(67.5)) + ns(nurses_midwives_per_1000_2018, knots = c(10)) + 
##     Smoking_prevalence_15_2016, data = tidy_joined_dataset)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.37985 -0.23483  0.04806  0.24919  0.95877 
## 
## Coefficients:
##                                                      Estimate Std. Error
## (Intercept)                                         -0.124244   0.507311
## ns(Stringency_Index, knots = c(25, 50, 75))1         0.889603   0.501310
## ns(Stringency_Index, knots = c(25, 50, 75))2         1.518347   0.388205
## ns(Stringency_Index, knots = c(25, 50, 75))3         2.248856   0.978427
## ns(Stringency_Index, knots = c(25, 50, 75))4         1.154782   0.385021
## Economic_Support_Index                               0.002778   0.002301
## ns(age15_64_population_prop_2019, knots = c(67.5))1  1.455291   0.490662
## ns(age15_64_population_prop_2019, knots = c(67.5))2  0.913580   0.376060
## ns(nurses_midwives_per_1000_2018, knots = c(10))1    1.756999   0.388970
## ns(nurses_midwives_per_1000_2018, knots = c(10))2    1.122037   0.402862
## Smoking_prevalence_15_2016                          -0.012081   0.006710
##                                                     t value Pr(>|t|)    
## (Intercept)                                          -0.245 0.807277    
## ns(Stringency_Index, knots = c(25, 50, 75))1          1.775 0.080514 .  
## ns(Stringency_Index, knots = c(25, 50, 75))2          3.911 0.000217 ***
## ns(Stringency_Index, knots = c(25, 50, 75))3          2.298 0.024664 *  
## ns(Stringency_Index, knots = c(25, 50, 75))4          2.999 0.003796 ** 
## Economic_Support_Index                                1.207 0.231524    
## ns(age15_64_population_prop_2019, knots = c(67.5))1   2.966 0.004178 ** 
## ns(age15_64_population_prop_2019, knots = c(67.5))2   2.429 0.017814 *  
## ns(nurses_midwives_per_1000_2018, knots = c(10))1     4.517 2.61e-05 ***
## ns(nurses_midwives_per_1000_2018, knots = c(10))2     2.785 0.006950 ** 
## Smoking_prevalence_15_2016                           -1.801 0.076279 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4869 on 67 degrees of freedom
## Multiple R-squared:  0.5801, Adjusted R-squared:  0.5174 
## F-statistic: 9.256 on 10 and 67 DF,  p-value: 2.047e-09
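
The summary statistics in Table 5 can be cross-checked against one another. A sketch of the arithmetic, assuming \(n = 78\) observations (from Table 2) and \(p = 10\) estimated coefficients besides the intercept:

\[ \begin{aligned} R^2_{adj} &= 1 - (1 - R^2) \cdot \frac{n-1}{n-p-1} = 1 - (1 - 0.5801) \cdot \frac{77}{67} \approx 0.5174 \\ F &= \frac{R^2 / p}{(1 - R^2)/(n-p-1)} = \frac{0.5801/10}{0.4199/67} \approx 9.256 \end{aligned} \]
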
Table 6. ANOVA Table

| Term | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
|---|---|---|---|---|---|
| ns(Stringency_Index, knots = c(25, 50, 75)) | 4 | 6.6653288 | 1.6663322 | 7.030250 | 0.0000869 |
| Economic_Support_Index | 1 | 2.0693419 | 2.0693419 | 8.730547 | 0.0043147 |
| ns(age15_64_population_prop_2019, knots = c(67.5)) | 2 | 7.4160229 | 3.7080115 | 15.644088 | 0.0000027 |
| ns(nurses_midwives_per_1000_2018, knots = c(10)) | 2 | 5.0192683 | 2.5096342 | 10.588138 | 0.0001010 |
| Smoking_prevalence_15_2016 | 1 | 0.7684023 | 0.7684023 | 3.241887 | 0.0762790 |
| Residuals | 67 | 15.8805528 | 0.2370232 | NA | NA |
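
Each F value in Table 6 is the term's mean square divided by the residual mean square. The table appears to use sequential sums of squares, so each term is tested after the terms listed above it; this is an inference from the output (the p-value of the last term matches its t-test in Table 5), not something the report states. For the Economic Support Index:

\[ F_{ESI} = \frac{2.0693419}{0.2370232} \approx 8.7305 \]

Under that reading, the Economic Support Index can be significant here, where it is entered after the Stringency Index only, while its t-test in Table 5, which adjusts for all other terms, is not.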

Table 7. The 95% Confidence Intervals

| Term | 2.5 % | 97.5 % |
|---|---|---|
| (Intercept) | -1.1368400 | 0.8883524 |
| ns(Stringency_Index, knots = c(25, 50, 75))1 | -0.1110164 | 1.8902222 |
| ns(Stringency_Index, knots = c(25, 50, 75))2 | 0.7434856 | 2.2932084 |
| ns(Stringency_Index, knots = c(25, 50, 75))3 | 0.2959070 | 4.2018059 |
| ns(Stringency_Index, knots = c(25, 50, 75))4 | 0.3862767 | 1.9232864 |
| Economic_Support_Index | -0.0018145 | 0.0073707 |
| ns(age15_64_population_prop_2019, knots = c(67.5))1 | 0.4759263 | 2.4346558 |
| ns(age15_64_population_prop_2019, knots = c(67.5))2 | 0.1629620 | 1.6641978 |
| ns(nurses_midwives_per_1000_2018, knots = c(10))1 | 0.9806114 | 2.5333872 |
| ns(nurses_midwives_per_1000_2018, knots = c(10))2 | 0.3179212 | 1.9261522 |
| Smoking_prevalence_15_2016 | -0.0254745 | 0.0013117 |
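
Each interval in Table 7 is the estimate plus or minus a t critical value times its standard error. A sketch for the Economic Support Index, taking the estimate and standard error from Table 5 and using \(t_{0.975,\,67} \approx 1.996\) (a value supplied here, not printed in the report):

\[ 0.002778 \pm 1.996 \cdot 0.002301 \approx 0.002778 \pm 0.004593 = (-0.001815,\ 0.007371) \]

This agrees with the tabulated limits up to rounding.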

iii. Interpreting the regression table

Our fitted model is the following, where \(f_1, \dots, f_8\) denote the natural spline basis functions generated by ns():

\[ \begin{aligned}\widehat{Y}_{CCPTTH}^{0.2} =& b_{0} + b_{SI,0-25} \cdot f_{1}(x_1) + b_{SI,25-50} \cdot f_{2}(x_1) + b_{SI,50-75} \cdot f_{3}(x_1) \\ & + b_{SI,75-100} \cdot f_{4}(x_1) + b_{ESI} \cdot (x_2) + b_{15to64 APP,50-67.5} \cdot f_{5}(x_{3}) \\ & + b_{15to64 APP,67.5-85} \cdot f_{6}(x_{3}) + b_{NM,0-10} \cdot f_{7}(x_{4}) \\ & + b_{NM,10-20} \cdot f_{8}(x_{4}) + b_{SP} \cdot (x_{12}) \\ = & -0.1242 + 0.8896 \cdot f_{1}(x_1) + 1.5183 \cdot f_{2}(x_1) + 2.2489 \cdot f_{3}(x_1) \\ & + 1.1548 \cdot f_{4}(x_1) + 0.0028 \cdot (x_2) + 1.4553 \cdot f_{5}(x_{3}) \\ & + 0.9136 \cdot f_{6}(x_{3}) + 1.7570 \cdot f_{7}(x_{4}) \\ & + 1.1220 \cdot f_{8}(x_{4}) - 0.0121 \cdot (x_{12}) \end{aligned} \]

For each coefficient, we test the following hypotheses:

\[\begin{aligned} H_0:&\beta_{0} = 0 \\ \mbox{vs }H_A:& \beta_{0} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\beta_{SI, 0-25} = 0 \\ \mbox{vs }H_A:& \beta_{SI, 0-25} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\beta_{SI, 25-50} = 0 \\ \mbox{vs }H_A:& \beta_{SI, 25-50} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\beta_{SI, 50-75} = 0 \\ \mbox{vs }H_A:& \beta_{SI, 50-75} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\beta_{SI, 75-100} = 0 \\ \mbox{vs }H_A:& \beta_{SI, 75-100} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\beta_{ESI} = 0 \\ \mbox{vs }H_A:& \beta_{ESI} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\beta_{15to64 APP, 50-67.5} = 0 \\ \mbox{vs }H_A:& \beta_{15to64 APP, 50-67.5} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\beta_{15to64 APP, 67.5-85} = 0 \\ \mbox{vs }H_A:& \beta_{15to64 APP, 67.5-85} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\beta_{NM, 0-10} = 0 \\ \mbox{vs }H_A:& \beta_{NM, 0-10} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\beta_{NM, 10-20} = 0 \\ \mbox{vs }H_A:& \beta_{NM, 10-20} \neq 0 \end{aligned}\]
\[\begin{aligned} H_0:&\beta_{SP} = 0 \\ \mbox{vs }H_A:& \beta_{SP} \neq 0 \end{aligned}\]
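
Each of these hypotheses is assessed with the t statistic reported in Table 5, which is the estimate divided by its standard error. For example, for the second Stringency Index basis term and for the Economic Support Index:

\[ \begin{aligned} t_{SI,25-50} &= \frac{1.518347}{0.388205} \approx 3.911 \\ t_{ESI} &= \frac{0.002778}{0.002301} \approx 1.207 \end{aligned} \]

The first is significant at the 0.1% level (p = 0.000217); the second is not significant at the 5% level (p = 0.231524).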

iv. Inference for multiple regression

Table 8. The 95% prediction intervals for the cumulative confirmed cases per 10,000 at Stringency Index = 20, 50, 70, and 90, respectively, with \((\text{cumulative confirmed cases per 10,000})^{0.2}\) = 2, Economic Support Index = 50, proportion of the population aged 15 to 64 in 2019 = 65, nurses and midwives per 1000 in 2018 = 5, and smoking prevalence for ages 15+ in 2016 = 25.

| SI | Point Estimate | Lower Limit | Upper Limit |
|---|---|---|---|
| 20 | 0.01993 | -1.35387 | 30.1566 |
| 50 | 11.46058 | 0.09172 | 127.5753 |
| 70 | 41.02339 | 1.66196 | 284.8189 |
| 90 | 57.59487 | 2.88898 | 369.6364 |
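
The limits in Table 8 are on the original scale. Assuming they were obtained by raising the transformed-scale limits to the fifth power (the report does not show this step), the transformed-scale values can be recovered by taking fifth roots. For Stringency Index = 20:

\[ \begin{aligned} 0.01993^{0.2} &\approx 0.457 \\ -\left(1.35387^{0.2}\right) &\approx -1.062, \qquad (-1.062)^{5} \approx -1.35 \end{aligned} \]

Since a fifth power preserves sign, a negative lower limit on the transformed scale stays negative after back-transformation. Cumulative cases cannot be negative, so that lower limit is best read as 0.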

IV. Discussion

i. Conclusions

ii. Limitations

iii. Further questions


V. Citations and References